Exploiting Alternatives for Text-To-Speech Synthesis: From Machine to Human

Authors

  • Nicolas Obin
  • Christophe Veaux
  • Pierre Lanchantin
Abstract

The absence of alternatives/variants is a dramatic limitation of text-to-speech synthesis compared to the variety of human speech. This paper introduces the use of speech alternatives/variants in order to improve text-to-speech synthesis systems. Speech alternatives denote the variety of possibilities that a speaker has to pronounce a sentence, depending on linguistic constraints, specific strategies of the speaker, speaking style, and pragmatic constraints. During training, the symbolic and acoustic characteristics of a unit-selection speech synthesis system are statistically modelled with context-dependent parametric models (GMMs/HMMs). During synthesis, symbolic and acoustic alternatives are exploited using a generalized Viterbi algorithm (GVA) to determine the sequence of speech units used for the synthesis. Objective and subjective evaluations provide evidence that the use of speech alternatives significantly improves speech synthesis over conventional speech synthesis systems. Beyond this, speech alternatives can also be used to vary the speech synthesis for a given text. The proposed method can easily be extended to HMM-based speech synthesis.
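The idea of keeping several alternatives alive during unit selection can be illustrated with a minimal sketch of a generalized (K-best) Viterbi search. This is not the authors' implementation: the function name `generalized_viterbi`, the callables `target_cost` and `join_cost`, and the toy unit labels are all assumptions made for illustration. In the paper the scores come from the trained GMM/HMM models, whereas here they are plain numeric costs.

```python
import heapq

def generalized_viterbi(candidates, target_cost, join_cost, k=2):
    """Return up to k lowest-cost unit sequences as (cost, path) pairs.

    candidates  : list over positions; each entry is a list of candidate units
    target_cost : hypothetical callable (position, unit) -> float
    join_cost   : hypothetical callable (previous_unit, unit) -> float
    k           : number of survivor paths kept per candidate unit
    """
    if not candidates:
        return []
    # For each candidate at position 0, a list of (cost, path) survivors.
    survivors = [[(target_cost(0, u), (u,))] for u in candidates[0]]
    for t in range(1, len(candidates)):
        new_survivors = []
        for u in candidates[t]:
            # Extend every surviving path with unit u, then keep the k best.
            extended = [
                (cost + join_cost(path[-1], u) + target_cost(t, u), path + (u,))
                for paths in survivors
                for cost, path in paths
            ]
            new_survivors.append(heapq.nsmallest(k, extended, key=lambda e: e[0]))
        survivors = new_survivors
    final = [entry for paths in survivors for entry in paths]
    return heapq.nsmallest(k, final, key=lambda e: e[0])

# Toy example with made-up costs (assumptions, not data from the paper).
toy_target = {"a1": 1.0, "a2": 2.0, "b1": 1.0, "b2": 3.0}
toy_candidates = [["a1", "a2"], ["b1", "b2"]]
best = generalized_viterbi(
    toy_candidates,
    target_cost=lambda t, u: toy_target[u],
    join_cost=lambda p, u: 0.0 if p[-1] == u[-1] else 0.5,
    k=2,
)
for cost, path in best:
    print(cost, path)
```

With `k=1` the search reduces to the conventional Viterbi algorithm, which returns a single best sequence. Larger values of `k` retain ranked alternatives that could be used either to rescore the final choice or to vary the output for the same text.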


Similar resources

Study on Unit-Selection and Statistical Parametric Speech Synthesis Techniques

One of the interesting topics in the multimedia domain concerns enabling computers to produce speech. Speech synthesis grants computers the human ability of speech production. The data-based approach and the process-based approach are the two main approaches to speech synthesis. Each approach has its own challenges. Unit-selection speech synthesis and statistical parametr...


ACTOR: A multilingual unit-selection speech synthesis system

The ACTOR® Text-To-Speech (TTS) synthesis system, developed at Loquendo S.p.A., is described here. The system employs a unit-selection concatenative synthesis technique, relying on labeled acoustic databases providing phonetic and prosodic coverage of the intended language/domain and on an original algorithm for run-time selection of the acoustic units to be concatenated. This technique yields...


Auditory/visual speech in multimodal human interfaces

It has long been a hope, expectation, and prediction that speech would be the primary medium of communication between humans and machines. To date, this dream has not been realized. We predict that exploiting the multimodal nature of spoken language will facilitate the use of this medium. We begin our pape...


Design of English to Hindi Corpus Based Text Conversion and Hindi Text to Speech Synthesis

English is a global language but is understood by only a small percentage of the population in India. It continues to remain a barrier for the rural population to learn and compete at a global level. Machine translation helps people from different places to understand an unknown language without the aid of a human translator. A Text to Speech system generates speech from text given as input. The proposed system wi...


Steps and Method of Preparing Syllable and Diphone Speech Databases for a Persian Text-to-Speech System

Speech databases are part of concatenative text-to-speech synthesis systems. The phonetic quality of the databases plays a significant role in the naturalness of the synthesized speech. This paper introduces two speech databases for Persian, one syllable-based and one diphone-based, and investigates how they were developed, their specifications, and their relative advantages. ...




Publication date: 2017